TITLE



Peptide and Nucleic Acid Molecule 



FIELD OF INVENTION: 



The invention relates to a peptide which is capable of
cleaving a peptide bond, to a nucleic acid molecule
encoding the peptide, to a vector and a cell comprising the
nucleic acid molecule, to a composition or a kit comprising
the peptide, to a method of making the peptide, to an
antibody which binds the peptide and to a method of
cleaving a peptide bond using the peptide.



BACKGROUND OF THE INVENTION

Serine proteases are a family of protein cleaving enzymes.
Members of this family have distinct substrate specificity.
The prolyl oligopeptidases, dipeptidyl peptidase 4 (DPP4)
and fibroblast activation protein (FAP) are serine
proteases. DPP4 has substrate specificity for peptides
which contain the di-peptide sequence, Ala-Pro, and cleaves
a peptide which contains the di-peptide by hydrolysis of a
peptide bond which is located C-terminal adjacent to
proline in the di-peptide. DPP4 also has substrate
specificity for peptides which contain the di-peptide
sequence, Gly-Pro, and cleaves a peptide which contains
this di-peptide by hydrolysis of a peptide bond which is
located C-terminal adjacent to proline in the di-peptide.
FAP has a substrate specificity which is similar to the
specificity of DPP4, although FAP also has gelatinase
activity.

SUMMARY OF THE INVENTION

The inventors have isolated and characterised a new prolyl
oligopeptidase and the gene encoding it. The inventors
have named the new prolyl oligopeptidase DPP4L1.

As described herein, the substrate specificity of DPP4L1 is
distinct from the substrate specificity of other prolyl
oligopeptidases.

In one aspect, the invention provides a peptide which is
capable of cleaving a peptide bond which is C-terminal
adjacent to proline in the sequence Ala-Pro, and which is
not capable of cleaving a peptide bond which is C-terminal
adjacent to proline in the sequence Gly-Pro.

The capacity of DPP4L1 to cleave, or in other words,
hydrolyse a peptide bond which is C-terminal adjacent to
proline in the dipeptide sequence Ala-Pro shows that DPP4L1
is a prolyl oligopeptidase. The inability of DPP4L1 to
cleave a peptide bond which is C-terminal adjacent to
proline in the dipeptide sequence Gly-Pro shows that DPP4L1
is a prolyl oligopeptidase with a substrate specificity
which is distinguished from other prolyl oligopeptidases.

The capacity of a prolyl oligopeptidase to cleave a peptide
bond which is C-terminal adjacent to proline in the
di-peptide sequence Ala-Pro, or Gly-Pro, can be determined
by standard techniques as described herein. For example, the
capacity to cleave a peptide bond which is C-terminal
adjacent to proline in the di-peptide sequence Ala-Pro can
be determined by observing hydrolysis of a peptide bond
which is C-terminal adjacent to proline in the molecule
Ala-Pro-p-nitroanilide. The capacity to cleave a peptide
bond which is C-terminal adjacent to proline in the
dipeptide sequence Gly-Pro can be determined by observing
hydrolysis of the peptide bond which is C-terminal adjacent
to proline in the molecule Gly-Pro-p-nitroanilide. In one
embodiment, the peptide of the first aspect of the
invention is capable of cleaving the peptide bond
C-terminal adjacent to proline in the compound
Ala-Pro-p-nitroanilide and is not capable of cleaving the
peptide bond C-terminal adjacent to proline in a compound
selected from the group of compounds consisting of
Gly-Pro-p-nitroanilide, Gly-Arg-p-nitroanilide,
Gly-Pro-p-toluene sulphonate and
Gly-Pro-7-amino-4-trifluoromethylcoumarin.

The inventors believe that an amino acid sequence,
Gly-Trp-Ser-Tyr-Gly-Gly, which is comprised in the amino acid
sequence of DPP4L1 described herein, is likely to be
involved in the enzymatic activity of DPP4L1. The
inventors further believe that the amino acid sequences,
Leu-Asp-Glu-Asn-Val-His-Phe-Ala-His and
Glu-Arg-His-Ser-Ile-Arg, which are also comprised in the
amino acid sequence of DPP4L1 described herein, are likely
to be involved in the enzymatic activity of DPP4L1. Thus in
another embodiment, the peptide of the first aspect of the
invention comprises an amino acid sequence
Gly-Trp-Ser-Tyr-Gly-Gly. In another embodiment, the peptide
comprises an amino acid sequence
Leu-Asp-Glu-Asn-Val-His-Phe-Ala-His. In another embodiment,
the peptide comprises an amino acid sequence
Glu-Arg-His-Ser-Ile-Arg.
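
The three-letter motifs named above can be written in one-letter code before searching a protein sequence for them. The following is a minimal sketch only, assuming the standard amino acid abbreviations; the function names and the short test sequence in the usage note are hypothetical and are not taken from the specification.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Motifs the specification associates with enzymatic activity.
MOTIFS = [
    "Gly-Trp-Ser-Tyr-Gly-Gly",
    "Leu-Asp-Glu-Asn-Val-His-Phe-Ala-His",
    "Glu-Arg-His-Ser-Ile-Arg",
]


def to_one_letter(motif):
    """Convert a hyphen-separated three-letter motif to one-letter code."""
    return "".join(THREE_TO_ONE[res.strip()] for res in motif.split("-"))


def find_motifs(sequence):
    """Map each one-letter motif to its 0-based position, or -1 if absent."""
    return {m: sequence.find(m) for m in map(to_one_letter, MOTIFS)}
```

For instance, `find_motifs("MAAGWSYGGK")` (a made-up sequence) reports the first motif at position 3 and the other two as absent.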

The biochemical characterisation of DPP4L1 described herein
shows that DPP4L1 consists of 882 amino acids and has a
molecular weight of about 100 kDa. Thus in another
embodiment, the peptide of the first aspect of the
invention consists of about 882 amino acids and has a
molecular weight of about 100 kDa.

The inventors recognise that by using standard techniques
it is possible to generate a peptide which is a truncated
form of DPP4L1 and which retains the substrate specificity
of DPP4L1. Thus it is recognised that a peptide which has
the substrate specificity of DPP4L1 may consist of less
than 882 amino acids, or may have a molecular weight of
less than 100 kDa.

As described herein, the amino acid sequence of DPP4L1
which is predicted from the nucleotide sequence of the
nucleic acid molecule which encodes DPP4L1 does not contain
a consensus sequence for N-linked glycosylation. Therefore
the inventors believe that it is unlikely that DPP4L1 is
associated with N-linked glycosylation. In this regard,
DPP4L1 is distinguished from other prolyl oligopeptidases
which contain between 6 and 9 consensus sequences for
N-linked glycosylation. Thus in a further embodiment, an
asparagine residue in the amino acid sequence of the
peptide of the first aspect of the invention is not linked
to a carbohydrate molecule.
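
The usual consensus sequence (sequon) for N-linked glycosylation is Asn-X-Ser/Thr, where X is any residue other than proline. A minimal sketch of how a one-letter protein sequence could be scanned for this sequon follows. The function name and the toy sequences are hypothetical, and this is not presented as the analysis the inventors carried out.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def n_glycosylation_sites(sequence):
    """Return 0-based positions of Asn residues in N-X-S/T sequons."""
    return [m.start() for m in SEQUON.finditer(sequence.upper())]
```

An empty result for a full-length sequence would be consistent with the absence of a consensus sequence described above; a non-empty result only marks potential sites, not actual glycosylation.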

The analysis of DPP4L1 expression described herein shows
that it is likely that DPP4L1 is expressed as a cytoplasmic
protein. The expression of DPP4L1 is therefore
distinguished from other prolyl oligopeptidases, which are
expressed on the cytoplasmic membrane, or in other words,
the cell surface membrane. Thus in another embodiment, the
peptide of the first aspect of the invention is not
expressed on a cell surface membrane of a cell.

The inventors believe that a peptide which has the
substrate specificity of DPP4L1 can be generated which has
the amino acid sequence of DPP4L1 described herein and
which contains one or more amino acid deletions,
substitutions or insertions of that amino acid sequence.
It is expected that a peptide which is at least 51%
homologous to the amino acid sequence of DPP4L1 described
herein, or which is at least 27% identical to the amino
acid sequence of DPP4L1, will retain the substrate
specificity of DPP4L1. The % homology can be determined by
use of the program/algorithm "GAP" which is available from
Genetics Computer Group (GCG), Wisconsin. Thus in another
embodiment, the peptide of the first aspect of the
invention has an amino acid sequence which is at least 50%
homologous to the amino acid sequence of DPP4L1.
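
To illustrate what a percentage identity figure means, the sketch below counts identical residues over the aligned, ungapped columns of two sequences that have already been aligned. It is not the GAP program and does not perform an alignment; the choice of denominator (columns where neither sequence has a gap) is an assumption, and other conventions give different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns with no gap in either row."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)
```

For example, the toy alignment `GW-YGG` against `GWSYAG` has five ungapped columns, four of them identical.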

As described herein, the inventors characterised the
nucleotide sequence of the nucleic acid molecule encoding
DPP4L1 and from this, were able to predict the amino acid
sequence of DPP4L1. The amino acid sequence of DPP4L1 is
shown in Figure 1. In an embodiment, the peptide of the
first aspect of the invention has the amino acid sequence
shown in Figure 1.

The inventors recognise that DPP4L1 may be fused, or in
other words, linked to a further amino acid sequence to
form a fusion protein which retains substrate specificity
of DPP4L1. An example of a fusion protein is described
herein which comprises the amino acid sequence of DPP4L1
which is linked to a further "tag" sequence which consists
of an amino acid sequence encoding the V5 epitope and a His
tag. An example of another fusion protein which comprises
the amino acid sequence of DPP4L1 is a GST fusion protein.
Thus in another embodiment, the peptide of the first aspect
of the invention is linked to a further amino acid
sequence.



The inventors further recognise that the amino acid
sequence of DPP4L1 shown in Figure 1 may be comprised in a
polypeptide which has the substrate specificity of DPP4L1.
The polypeptide may be useful, for example, to alter the
protease susceptibility of the DPP4L1 amino acid sequence.
Thus in another embodiment, the peptide of the first aspect
of the invention is comprised in a polypeptide which has
the substrate specificity of DPP4L1.

In a second aspect, the invention relates to a nucleic acid
molecule which encodes a peptide according to the first
aspect of the invention.

As described herein, the inventors believe that the gene
which encodes DPP4L1 is located at band q22 on human
chromosome 15. The location of the DPP4L1 gene is
distinguished from genes encoding other prolyl
oligopeptidases, which are located on chromosome 2, at
bands 2q24.3 and 2q23, or chromosome 7. Thus in an
embodiment, the nucleic acid molecule of the second aspect
of the invention is capable of hybridising to a gene which
is located at band q22 on human chromosome 15.

The inventors have characterised the nucleotide sequence of
the nucleic acid molecule encoding DPP4L1. The nucleotide
sequence of the nucleic acid molecule encoding DPP4L1 is
shown in Figure 1. Thus in an embodiment, the nucleic acid
molecule of the second aspect of the invention has the
nucleotide sequence shown in Figure 1.

The inventors recognise that a nucleic acid molecule which
has the nucleotide sequence shown in Figure 1 could be made
by producing only the fragment of the nucleotide sequence
which is translated. Thus in an embodiment, the nucleic
acid molecule of the second aspect of the invention does
not contain 5' or 3' untranslated nucleotide sequences.

As described herein, the inventors observed at least three
splice variants of DPP4L1 RNA which are from 2.6 to 3.1
kb in length. As a frame shift mutation or termination
signal was not observed in the nucleotide sequence of these
splice variants, and as the coding sequence of two of the
splice variants includes sequence which encodes the DPP4L1
amino acid sequence which is believed to be associated with
enzymatic activity, the inventors believe that the splice
variants are likely to have the substrate specificity of
DPP4L1. Thus in another embodiment, the nucleic acid
molecule of the second aspect of the invention is a
fragment of the nucleotide sequence of DPP4L1 shown in
Figure 1 which is about 2.6 to 3.1 kb in length and which
encodes a peptide according to the first aspect of the
invention.



In another embodiment, the nucleic acid molecule of the
second aspect of the invention is selected from the group
of nucleic acid molecules consisting of T21, T8, RACE
product, ATCd3-2-1 and ATCd3-3-10, as shown in Figure 1.

In a third aspect the invention provides a vector which
comprises a nucleic acid molecule according to the second
aspect of the invention.

In one embodiment, the vector of the third aspect of the
invention is capable of replication in a COS-7 cell or
E. coli. In another embodiment, the vector is selected from
the group consisting of λTriplEx, pTriplEx, pGEM-T Easy®
Vector and pcDNA3.1/V5/His.

In a fourth aspect, the invention provides a cell which
comprises a vector according to the third aspect of the
invention.

In one embodiment, the cell of the fourth aspect of the
invention is an E. coli cell. Preferably, the E. coli is
BM25.8. In another embodiment, the cell is a COS-7 cell.



In a fifth aspect, the invention provides a method for
making a peptide according to the first aspect of the
invention which comprises the step of maintaining a cell
according to the fourth aspect of the invention in
conditions in which the peptide is expressed by the cell.
In one embodiment, the method of the fifth aspect of the
invention comprises the further step of isolating the
peptide.

In a sixth aspect, the invention provides a peptide when
produced by the method of the fifth aspect of the
invention.

In a seventh aspect, the invention provides a composition
comprising a peptide according to the first or sixth aspect
of the invention and a pharmaceutically acceptable carrier.

In an eighth aspect, the invention provides an antibody
which is capable of binding a peptide according to the
first or sixth aspect of the invention.

In one embodiment, the antibody of the eighth aspect of the
invention is secreted by a hybridoma cell.

In a ninth aspect, the invention provides a hybridoma cell
which secretes an antibody according to the eighth aspect of
the invention.

In a tenth aspect, the invention provides a method of
cleaving a molecule which comprises a di-peptide sequence
Ala-Pro at a peptide bond which is C-terminal adjacent to
proline in the di-peptide, the method comprising
maintaining the molecule in the presence of a peptide
according to the first aspect or the sixth aspect of the
invention so that the peptide bond C-terminal adjacent to
proline in the di-peptide is cleaved.

In one embodiment of the tenth aspect of the invention, the
molecule further comprises the di-peptide sequence,
Gly-Pro.

In an eleventh aspect the invention provides a kit
comprising the peptide of the first or sixth aspects of the
invention, or an antibody according to the eighth aspect of
the invention.

BRIEF DESCRIPTION OF THE FIGURES 

Fig 1. Cloning strategy for isolating full-length DPP4L1
cDNA and the alternative splicing variants of DPP4L1
observed. Representation of three splice variants is shown
including loss of serine recognition site by one splice
variant (T8).

Fig 2. Nucleotide sequence and amino acid sequence of human
DPP4L1. The nucleotide and predicted one letter code amino
acid sequence are shown. This sequence shows no putative
membrane spanning domain (deduced from hydrophobicity
plots) or potential N-linked glycosylation sites. The
putative serine recognition site and aspartic acid and
histidine which form the SER-ASP-HIS catalytic domain are
marked. Base pairs are numbered in the right margin.

Fig 3. Alignment of the predicted protein sequence of
DPP4L1 with human DPP4 and C. elegans homologue. The amino
acid sequences were aligned using PileUp alignment program
in GCG. Amino acid residues identical in all three proteins
are boxed.

Fig 4. Northern Blot analysis of DPP4L1 expression. Human
multiple tissue Northern blots (CLONTECH) containing 2 µg
per lane of poly A+ RNA were hybridized with a 32P-labeled
DPP4L1 probe at 68°C and washed at high stringency. The
autoradiograph was exposed for 1 day at -70°C with a BIOMAX
MS screen. Molecular mass markers are indicated in base
pairs on the left side of each autoradiogram.

Fig 4a. Master RNA (CLONTECH) blot of poly A+ RNA was
hybridized with a 32P-labeled DPP4L1 probe at 65°C and
washed at high stringency. The autoradiograph was exposed
for 3 days at -70°C with BIOMAX MS screen. DPP4L1 mRNA was
detected in all tissues examined.

Fig 5. Chromosomal localization of human DPP4L1. Metaphase
showing FISH with the biotinylated DPP4L1 cDNA probe.
Normal male chromosomes stained with DAPI. Hybridization
sites on chromosome 15 are indicated by an arrow.

Fig 6. Western blot analysis of transfected cell lines.
Analysis of lysates of stable cell lines. DPP4L1 protein
was seen in DPP4L1/V5/His stable cell line but not in DPP4
or vector only stable cell lines. The electrophoretic
mobility of the protein was not altered when samples were
boiled. The band of greater mobility was probably a
breakdown product of intact DPP4L1.

Fig 7. Human DPP4L1 conferred Ala-Pro DPP activity upon COS
cells transfected with DPP4L1 cDNA.

Fig 8. Detection of DPP4L1 expression in COS-7 cells by
fluorescent staining and phase contrast microscopy.

DETAILED DESCRIPTION OF THE INVENTION

EXAMPLES 
General 

Restriction enzymes and other enzymes used in cloning were
obtained from Boehringer Mannheim Roche. Standard molecular
biology techniques were used (30) unless indicated
otherwise.

An EST clone (GENBANK™ accession number AA417787) was
obtained from American Type Culture Collection. The DNA
insert of this clone was sequenced on both strands using
automated sequencing at SUPAMAC (Sydney, Australia).



DPP4L1 Cloning 

ESTAA417787 was used to design forward (caa ata gaa att gac
gat cag gtg) and reverse (tct tga agg tag tgc aaa aga tgc)
DPP4L1 primers for polymerase chain reaction (PCR) from
ESTAA417787. The PCR conditions were as follows: 94°C for 5
min, followed by 35 cycles of 94°C for 1 minute, 55°C for
30 sec and 70°C for 1 min. This 484 bp PCR product was gel
purified, 32P-labeled using Megaprime Labeling Kit
(Amersham Pharmacia Biotech, UK) and hybridized to a Master
RNA blot (CLONTECH, Palo Alto, CA, USA) that contained poly
A+ RNA from 50 adult and fetal tissues immobilized in dots
as per manufacturers' instructions. This Master RNA blot was
also probed with DPP4 for comparison of mRNA tissue
expression.
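
As a quick sanity check on primer sequences such as the forward and reverse DPP4L1 primers given above, length, GC content and the reverse complement can be computed directly. The following is a minimal sketch; the helper names are hypothetical and nothing here reproduces a tool the inventors report using.

```python
# Complement table for DNA bases, preserving case.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

# Primer sequences as printed in the text (spaces group triplets).
FORWARD = "caa ata gaa att gac gat cag gtg"
REVERSE = "tct tga agg tag tgc aaa aga tgc"


def clean(primer):
    """Remove the whitespace used to group bases in print."""
    return "".join(primer.split())


def gc_fraction(primer):
    """Fraction of bases that are G or C."""
    seq = clean(primer).lower()
    return (seq.count("g") + seq.count("c")) / len(seq)


def reverse_complement(primer):
    """Reverse complement, written 5' to 3'."""
    return clean(primer).translate(COMPLEMENT)[::-1]
```

Both primers are 24 nucleotides long; `gc_fraction(FORWARD)` is 9/24 and `gc_fraction(REVERSE)` is 10/24.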

The forward and reverse DPP4L1 primers were used for PCR
to screen a human placental λ STRETCH PLUS library
(CLONTECH, Palo Alto, CA, USA) for the presence of DPP4L1
cDNA in the library. The library was then screened by
standard molecular biology techniques. After primary
screenings, 23 clones were selected for secondary screening
after which 22 remained positive. For the tertiary screen
the clones contained in λTriplEx were converted into
pTriplEx plasmids and transformed into BM25.8 E. coli.

It was shown that all 22 clones were positive. Two of
these clones, T8 and T21, were selected for further study.

5' RACE (Rapid amplification of cDNA ends)

A 5' RACE Version 2.0 kit (Gibco BRL, Life Technologies)
was applied on activated T cell (ATC) and placental RNA as
prescribed in the kit instructions. The T8 DNA sequence was
used to design GSP1 (TGC TTG GTT GAG GAT GAA TG) and GSP2
(CTT AAA ACT GAC TTT AGG ATT TGC TGT AGG). 5' RACE PCR
products were cloned into pGEM-T Easy® Vector (Promega Co.,
Madison, WI, USA) and sequenced by primer walking.

Confirmation of identity of RACE product 

Reverse transcriptase PCR was carried out on ATC RNA using
DPP4L1-pr23 (GGA AGA AGA TGC CAG ATC AGC TGG) and
DPP4L1-pr19r (TCC GTG TAT CCT GTA TCA TAG AAG) to span across
the junction between the RACE product and the EST and library
clones. Two gel purified products ATCd3-2-1 (1603bp) and
ATCd3-3-10 (1077bp) were cloned into pGEM-T Easy® (Promega
Co., Madison, WI, USA) and sequenced.

Subcloning of DPP4L1 cDNA into a pcDNA3.1/V5/His Expression
Vector

The ATC RACE product, the ATCd3-2-1 (1603bp) junction
fragment and the library clone T21 were joined together and
cloned into the expression vector pcDNA3.1/V5/His A
(Invitrogen) to form a DPP4L1 cDNA of 3.1 kb with an open
reading frame of 882 aa. The first construct was made using
three sequential cloning steps. Firstly, an Eco RV/Xba I
fragment of T21 (containing 3' DPP4L1, stop codon and 3'
untranslated region of DPP4L1 cDNA) was ligated into the
vector pcDNA3.1/V5/His A which was digested with Eco
RV/Xba I. An Eco RI/Eco RV fragment of ATCd3-3-1 was then
added to this construct digested with Eco RI/Eco RV.
Finally the RACE product was cut with Eco RI and cloned
into the Eco RI site of the previous construct to form the
complete 3.1 kb DPP4L1 cDNA. This construct pcDNA3.1-DPP4L1
expressed protein with no detectable tag. In addition the
stop codon in the DPP4L1 expression construct in
pcDNA3.1/V5/His V5 was genetically altered using PCR to
create a C-terminal fusion with the V5 and His tag
contained in the vector. This construct was named
pcDNA3.1-DPP4L1/V5/His. All expression constructs subcloned
into pcDNA3.1/V5/His were verified by full sequence analysis.
DPP4L1 gene expression by Northern Blot

Human multiple tissue Northern blots (CLONTECH) containing
2 µg of poly A+ RNA were prehybridized in Express
Hybridization solution (CLONTECH) for 30 min at 68°C.
Both the DPP4L1 484 bp product and the 5' RACE ATC product
were radiolabeled using a Megaprime Labeling kit (Amersham
Pharmacia Biotech) and [32P]dCTP (NEN Dupont).
Unincorporated label was removed using a NICK column
(Amersham Pharmacia Biotech) and the denatured probe was
incubated for 2 hrs at 68°C in Express Hybridization
solution. Washes were performed at high stringency and
blots exposed to BIOMAX MS film overnight with a BIOMAX
MS screen at -70°C.

DPP4L1 expression by RT-PCR

Reverse transcriptase PCR was performed on human ATC RNA,
human placental RNA and human liver RNA using the primers
DPP4L1/pr3 (GCA CTA CCT TCA AGA AAA CCT TGG) and
DPP4L1/pr20R (TAT GGT ATT GCT GGG TCT CTC AGG) to give a
293 bp product.

Transient Transfection into COS cells

Monkey kidney fibroblast (COS-7) cells (ATCC CRL-1651)
were grown in Dulbecco's MEM medium supplemented with 10%
fetal calf serum and 2mM glutamine. A subconfluent 75 cm2
flask of COS cells was transfected using 15 µg DNA and 48

hrs before harvesting. For making stable cell lines,
Geneticin (G418, Gibco BRL) was added 24 hrs after
transfection and cells were maintained and grown
continuously in media containing G418 selection.






Determination of DPP activity of DPP4L1

DPP4 enzyme assays were performed on trypsin/EDTA-harvested
COS-7 cells 72 hrs after transfection and used
Gly-Pro-p-nitroanilide-p-toluene sulfonate salt (Sigma, St
Louis, MO, USA) [Duke-Cohan, 1996], Ala-Pro-p-nitroanilide
HCl (Bachem, Switzerland) or Gly-Arg-p-nitroanilide HCl
(Sigma) as the substrates. Transfected cells were lysed by
sonication then incubated at 20,000 cell equivalents per
well in 70 µl phosphate buffer, pH 7.0 for 40 minutes at
37°C. Absorbances at 690nm were subtracted from
absorbances at 405nm to increase the specificity of
measurements. Analyses of Michaelis Menten kinetics used
KaleidaGraph (Hearne Scientific Software). Assays were
performed in triplicate on two transfections.
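
The background correction and the kinetic model described above can be sketched as follows. This is an illustration under stated assumptions, not the inventors' analysis: the specification used KaleidaGraph, whereas the fit below swaps in a simple Hanes-Woolf linearisation (S/v plotted against S, slope 1/Vmax, intercept Km/Vmax), and all function names and numbers are hypothetical.

```python
from statistics import mean


def corrected_absorbance(a405, a690):
    """Subtract the 690 nm reference reading from the 405 nm reading."""
    return a405 - a690


def michaelis_menten(s, vmax, km):
    """Michaelis-Menten rate: v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, rates):
    """Estimate (Vmax, Km) by least squares on S/v = S/Vmax + Km/Vmax."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km
```

Triplicate wells would typically be averaged with `mean(...)` after correction and before fitting. A nonlinear fit, as a dedicated curve-fitting package performs, is generally preferred for noisy data; the linearisation is used here only to keep the sketch dependency-free.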



Chromosomal localization of DPP4L1 by Fluorescence in situ 
Hybridization (FISH) analysis 

DPP4L1 was localized using two different probes, the DPP4L1
EST and the T8 clone. The probes were nick-translated with
biotin-14-dATP and hybridized in situ at a final
concentration of 10 ng/µl to metaphases from two normal
males. The FISH method was modified from that previously
described (31) in that chromosomes were stained before
analysis with both propidium iodide (as counterstain) and
DAPI (for chromosomal identification). Images of metaphase
preparations were captured by a cooled CCD camera using the
CytoVision Ultra image collection and enhancement system
(Applied Imaging Int Ltd). FISH signals and the DAPI
banding pattern were merged for figure preparation.

Molecular cloning and sequence analysis of DPP4L1

The insert in ATCC EST AA417787 was 805 bp in length,
containing 537 bp of coding sequence, a TAA stop codon and
267 bp of 3' noncoding sequence (Figure 1).

The hybridization of the Master RNA blot revealed that the
gene comprising ESTAA417787 has ubiquitous tissue
expression, with high levels of expression in testis and
placenta. Based on this expression pattern, a placental
cDNA library was screened with a 484 bp PCR product
produced by the forward and reverse DPP4L1 primers.
Sequence homology analysis revealed that only 2 of 23
clones contained 5' sequence additional to the sequence of
ESTAA417787. These cDNA clones were designated T8 and T21,
and were 1.7 kb and 1.2 kb respectively (Figure 1). In
addition, comparison of these sequences to ESTAA417787
revealed that T8 cDNA lacked a 153 bp (51 aa) region that
was present in T21 cDNA and ESTAA417787. This deletion
would result in the loss of the catalytic serine (GWSYGG)
in T8 cDNA. Many of the other clones characterized appeared
to contain unrelated sequences which are probably intronic
sequences resulting from incomplete splicing.

The 5' RACE technique was utilized on both ATC RNA and
placental RNA to obtain the 5' end of the DPP4L1 gene.
The RACE product obtained from activated T cell RNA was 0.2
kb larger than that from placental RNA but otherwise
identical (Figure 1). The first methionine within a Kozak
sequence was found 211 bp from the 5' end of the activated
T cell RACE product. This 5' 211bp region was 70.5 % GC
rich and contained a number of potential promoter and
enhancer elements (including Sp1 and Ap1 sites) and so was
deduced to be the 5' flanking region of the DPP4L1 gene. In
order to confirm the identity of the 5' RACE product as the
5' end of DPP4L1, RT-PCR was carried out to span across the
junction between the RACE product and the EST and library
clones. The RT-PCR on ATC RNA produced two clones ATCd3-2-1
and ATCd3-3-10 (Figure 1). Compared to T8 and T21, both
clones had an additional insert region of 144bp (48 aa)
immediately adjacent to the splice site of T8. Sequence
homology analysis of this additional insert region found a
homologous region in both the C. elegans homologue and
DPP4. This clearly showed that T8 and T21 library clones
represented splice variants of DPP4L1. The smaller clone
ATCd3-3-10 was also found to represent another splice
variant of DPP4L1 as it contained a 516 bp deletion at the
5' end which would result in a deletion of 175 aa. At this
point the biological significance of the three different
splice variants observed is unclear.

A full-length DPP4L1 clone was created using the larger
RACE product, ATCd3-2-1 and the T21 library clone. This
generated a putative DPP4L1 cDNA of 3.1 kb (including 5'
and 3' untranslated regions) with an open reading frame of
882 aa for further sequence analysis and examining DPP4L1
function. This putative 882 aa DPP4L1 protein contained no
N-linked glycosylation sites and Kyte-Doolittle
hydrophobicity analyses revealed it lacked a transmembrane
domain, unlike DPP4, FAP and DPP6. Thus it is likely that
DPP4L1 is a cytoplasmic protein (Figure 2). The predicted
DPP4L1 protein shared 51 % amino acid similarity and 27 %
amino acid identity with human DPP4; the C termini of these
proteins exhibited the most homology (Figure 3).
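
A Kyte-Doolittle hydrophobicity analysis of the kind referred to above averages per-residue hydropathy values over a sliding window. The sketch below uses the published Kyte-Doolittle scale; the window of 19 residues and the cut-off of 1.6 are a commonly used convention for flagging candidate membrane-spanning segments and are assumptions here, as the specification does not state the parameters used. Function names and test sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values per residue.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy for every full window along the sequence."""
    scores = [KD[res] for res in sequence.upper()]
    if len(scores) < window:
        return []
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def has_candidate_tm_segment(sequence, window=19, threshold=1.6):
    """True if any window average exceeds the threshold."""
    return any(score > threshold
               for score in hydropathy_profile(sequence, window))
```

A profile with no window above the cut-off is consistent with, but does not prove, the absence of a transmembrane domain.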



Tissue distribution of DPP4L1 as determined by Master RNA
and Northern Blot

A master RNA blot was probed with a 484 nt PCR product
produced by the forward and reverse DPP4L1 primers as
mentioned previously. The mRNA tissue expression of DPP4L1
was ubiquitous in all human adult and fetal tissues. A
similar ubiquitous expression pattern was observed using
DPP4 cDNA as a probe (data not shown). However, by visual
assessment the greatest levels of expression using each
gene specific probe were in different tissues. The most
intense signals using the DPP4L1 probe were in testis
followed by placenta whereas the most intense signals using
the DPP4 probe were in salivary gland and prostate gland
followed by placenta (data not shown). The probes did not
bind any of the negative controls on the blot.

Northern blot analysis was performed on mRNA derived from
different human tissues (Figure 4). Two DPP4L1 specific
probes indicated the presence of transcripts in all tissues
examined. A transcript approximately 3.0 kb in size,
consistent with the approximate expected size of DPP4L1
message, was detected only in the testis. However, two
transcripts of 8.0 and 5.0 kb respectively were present in
testis, spleen, peripheral blood leukocytes and ovary at
high levels; in prostate, small intestine, and colonic
mucosa at moderate levels; and in the thymus at lower
levels. The multiple tissue Northern blot was also probed
with radiolabeled human β-actin probe and a common 2.0 kb
transcript was seen in all tissues (Figure 4).

Expression and functional activity of DPP4L1

To assess the function of DPP4L1 protein, the full length
DPP4L1 cDNA of 3.1 kb was cloned into the Xba I site of
pcDNA3.1/V5/His A expression vector to produce two
constructs. The first construct, pcDNA3.1-DPP4L1, expressed
DPP4L1 protein on its own whilst the second construct,
pcDNA3.1-DPP4L1/V5/His, expressed a protein with the V5
epitope and His tag fused to the C-terminus of DPP4L1 to
facilitate analysis of protein expression. Mammalian
expression constructs were stably transfected into COS-7
cells and cellular sonicates prepared. Consistent with the
molecular weight predicted from the amino acid sequence, a
100 kDa monomer was detected by Western blotting of stable
cell line lysates. DPP4L1 protein was detected in the
cytoplasmic compartment but not on the surface of ethanol
fixed stable DPP4L1/V5/His expressing COS cells, using the
anti-V5 mAb. Due to homology between DPP4 and DPP4L1, cell
lysates were examined for serine protease activity.
Expression of DPP4 with and without the V5 and His tags in
COS cells was performed as a positive control and to
establish the working conditions of the assay. Homogenates
of vector-only transfections were used in parallel as
negative controls. Extracts of DPP4L1-transfected cells
hydrolyzed Ala-Pro-p-nitroanilide but not
Gly-Pro-p-nitroanilide, Gly-Arg-p-nitroanilide,
Gly-Pro-toluene sulphonate or
Gly-Pro-7-amino-4-trifluoromethylcoumarin.



Chromosomal localization of DPP4L1 

Two probes were used for FISH analysis, ESTAA417787 and the
T8 clone from the placental library. Seventeen metaphases
from the first normal male were examined for fluorescent
signal. All of these metaphases showed signal on one or
both chromatids of chromosome 15 at band q22 (Figure 5).
There were a total of 2 non-specific background dots
observed in these metaphases. A similar result was obtained
from the hybridization of the probe to 15 metaphases from
the second normal male (data not shown).

We describe a novel human POP that we have called DPP4L1 
protein and the gene encoding it. Analysis of the open 
reading frame of the complete DPP4L1 cDNA sequence 
suggested that it is a cytoplasmic protein. Hydropathy 
analysis indicated that, in contrast to DPP4, FAP and DPP6, 
DPP4L1 does not contain a short hydrophobic region 
to act as a membrane-spanning domain. Human DPP4, FAP and 
DPP6 contain between 6 and 9 potential N-glycosylation 
sites in their amino acid sequences. A similar examination 
of the DPP4L1 cDNA sequence revealed that it had no sites of 
this type, which was a further indication that DPP4L1 is a 
cytoplasmic protein. The detection of tagged DPP4L1 protein 
in the cytoplasmic compartment but not on the surface of 
transiently transfected COS cells, using the anti-V5 mAb, 
further suggested that DPP4L1 is a cytoplasmic protein. 
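
The search for potential N-glycosylation sites described above can be illustrated with a short sketch. It assumes the canonical sequon Asn-X-Ser/Thr, where X is any residue except Pro; the text does not state which criterion was actually applied. The function name and the toy sequence are hypothetical and are not taken from the DPP4L1 sequence.

```python
import re

# Canonical N-glycosylation sequon (an assumption, see lead-in):
# Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons each be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of the Asn in each potential sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy sequence, one-letter amino acid code.
toy = "MKNGTLLNPSAANNSSK"
print(find_sequons(toy))
```

A sequence with no reported positions would be consistent with, but would not by itself prove, the absence of N-linked glycosylation.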

The most significant homology between DPP4L1 and DPP4 is in 
the C termini where the three catalytic residues Ser, Asp 
and His are located. By homology with DPP4, DPP4L1 is a 
member of the DPP4-like gene family, a member of the POP 
family and a member of the α/β hydrolase fold family (32). 
The catalytic residues in DPP4L1 that potentially form the 
charge-relay system are Ser, Asp and His [residue numbers illegible]. 
Transfection experiments were performed with constructs of 
DPP4L1 cDNA to demonstrate its ability to behave as a 
serine protease and exhibit DPP enzyme activity. DPP4L1 
cDNA constructs conferred DPP enzyme activity to cellular 
homogenates as demonstrated by their ability to hydrolyze 
the substrate Ala-Pro. However, constructs of DPP4L1 did 
not confer activity against Gly-Pro upon transfected COS 
cells, indicating that DPP4L1 has a different substrate 
specificity to DPP4. The physiological role of hydrolysis 
of Ala-Pro is unknown. 

When DPP4 is expressed on the surface of T cells it is known 
as the cell surface antigen CD26. CD26-negative cell lines 
have been shown to have residual DPP4 activity, indicating 
the existence of an alternative peptidase with DPP4 activity. 
DPP4β is a protein which shows a peptidase activity similar 
to DPP4 and has been purified from the CD26-negative cell 
line C8166 (27, 28). Purified DPP4β cleaves Gly-Pro 
substrate and is a glycosylated protein that exists on the 
cell surface as a 70-80 kDa monomer. Therefore, according to 
the substrate specificity, cellular localization and 
biochemical properties, DPP4L1 is a novel DPP distinct from 
DPP4β. 

During the cloning of DPP4L1 it became apparent that at 
least three alternately spliced transcripts of DPP4L1 other 
than full-length are present in the tissues examined. The 
biological significance of such transcripts is so far 
unknown. None of the three splice variants results in a 
frame shift or premature protein termination, so they can 
potentially produce intact but truncated DPP4L1 proteins. 
Two of the three splice variants contain all the catalytic 
triad residues and thus may still produce proteins with DPP 
activity. It is possible that expression of these sequences 
may be used to regulate the levels of active protein. In 
addition, analysis of DPP4L1 tissue distribution by 
Northern hybridization revealed a number of differently 
sized transcripts. However, the sizes of these transcripts 
did not concur with those expected from 
alternate splicing. The predicted sizes of alternately spliced 
variants of DPP4L1 would range from 2.6 to 3.1 kb, 
whereas the large transcripts seen in most tissues examined 
in the Northern blot were 8.5 and 5.0 kb in size. 
These transcript sizes are much larger than 
the 3.1 kb transcript predicted for DPP4L1 from the cloning 
strategy. These large transcripts may contain 5' and 3' 
untranslated sequences and therefore may still encode 
functional DPP4L1 protein. However, it is also possible 
that these transcripts represent incompletely spliced mRNA 
transcripts and therefore do not produce intact DPP4L1 
protein. Further work will determine the role of DPP4L1 in 
different tissues and whether alternative splicing has any 
biological role. 

Using FISH analysis to determine the chromosomal 
localization of DPP4L1, we observed a signal on chromosome 
15q22 for DPP4L1. Both DPP4 and FAP have been localized to 
the long arm of chromosome 2, at 2q24.3 (33) and 2q23 (34) 
respectively. DPP6, which is further in sequence from DPP4 
and FAP, was localized to chromosome 7 (21). The localization 
of DPP4L1 to 15q22 predicts that DPP4-like gene family 
members could be spread throughout the human genome, and 
may be present on other chromosomes. The structure of a 
gene in C. elegans which encodes an amino acid sequence 
which is homologous to DPP4L1 has 19 exons spanning 5.3 kb. 
In C. elegans DPP4L1, the serine recognition site, GWSYGG, 
is found in exon 16 and does not span two exons as found in 
the genes for C. elegans and human DPP4 (6), and human and 
mouse FAP (25). The serine recognition site for C. elegans 
PEP is also found in one exon; therefore this arrangement 
may be representative of the ancestral POP gene and the 
arrangement in DPP4 and FAP may have resulted from 
divergent evolution from this ancestral gene. 
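
Whether a recognition site lies within one exon or spans two, as discussed above, can be checked from exon lengths and the position of the site in the spliced transcript. The following is a minimal sketch under stated assumptions: the function name, the coordinate convention and the toy exon lengths are all hypothetical and are not taken from the C. elegans or human genes.

```python
from bisect import bisect_right


def exons_overlapped(span_start: int, span_end: int,
                     exon_lengths: list[int]) -> list[int]:
    """Return the 1-based exon numbers overlapped by a transcript span.

    span_start and span_end are 0-based, end-exclusive coordinates in the
    spliced transcript; exon_lengths lists exon lengths in transcript order.
    """
    # Cumulative end coordinate of each exon in the spliced transcript.
    ends = []
    total = 0
    for length in exon_lengths:
        total += length
        ends.append(total)
    if span_start < 0 or span_end > total or span_start >= span_end:
        raise ValueError("span lies outside the transcript")
    first = bisect_right(ends, span_start)    # exon holding the first base
    last = bisect_right(ends, span_end - 1)   # exon holding the last base
    return list(range(first + 1, last + 2))


# Hypothetical toy transcript with three exons of 30, 18 and 24 bases.
toy_exons = [30, 18, 24]
# An 18-base span (six codons, the length of a six-residue motif).
print(exons_overlapped(30, 48, toy_exons))  # confined to a single exon
print(exons_overlapped(33, 51, toy_exons))  # crosses an exon boundary
```

A result with one exon number corresponds to the single-exon arrangement; two numbers correspond to a site that spans an exon junction.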

In summary, we have identified and characterized a novel 
human POP, DPP4L1, that exhibits DPP activity, and the gene 
encoding it. 



THE CLAIMS DEFINING THE INVENTION ARE AS FOLLOWS: 



1. A peptide which is capable of cleaving a 
peptide bond which is C-terminal to proline in the sequence 
Ala-Pro, and which is not capable of cleaving a peptide 
bond which is C-terminal to proline in the sequence 
Gly-Pro. 

2. A peptide according to claim 1 which is 
capable of hydrolysing Ala-Pro-p-nitroanilide, and which is 
not capable of hydrolysing a molecule selected from the 
group consisting of Gly-Pro-p-nitroanilide, Gly-Arg-p- 
nitroanilide, Gly-Pro-toluene sulphonate and Gly-Pro-7- 
amino-4-trifluoromethyl coumarin. 

3. A peptide according to claim 1 or 2 which 
comprises the amino acid sequence Gly-Trp-Ser-Tyr-Gly-Gly. 

4. A peptide according to any one of the 
preceding claims which comprises the amino acid sequence 
Leu-Asp-Gly-Asn-Val-His-Phe. 

5. A peptide according to any one of the 
preceding claims which comprises the sequence Glu-Arg-His- 
Ser-Ile-Arg. 

6. A peptide according to any one of the 
preceding claims which has a molecular weight of about 100 
kDa. 

7. A peptide according to any one of the 
preceding claims which consists of about 880 amino acids. 

8. A peptide according to any one of the 
preceding claims wherein Asn in the amino acid sequence of 
the peptide is not linked to a carbohydrate. 

9. A peptide according to any one of the 
preceding claims, wherein the peptide is not expressed on a 
cell surface membrane of a cell. 

10. A peptide according to any one of the 
preceding claims, wherein the peptide has an amino acid 
sequence which is at least 50% homologous to the amino acid 
sequence shown in Figure 1. 

11. A peptide according to any one of the 
preceding claims which has the amino acid sequence shown in 
Figure 1. 

12. A peptide according to any one of the 
preceding claims which is linked to a further amino acid 
sequence. 

13. A polypeptide comprising a peptide according 
to any one of the preceding claims. 

14. A nucleic acid molecule which encodes a 
peptide according to any one of the preceding claims. 

15. A nucleic acid molecule according to claim 
14 which is capable of hybridising to a gene which is 
located at band q22 on human chromosome 15. 

16. A nucleic acid molecule according to claim 
14 or 15 which has a nucleotide sequence shown in Figure 1. 

17. A nucleic acid according to any one of 
claims 14 to 16 which does not contain 5' or 3' 
untranslated regions. 

18. A nucleic acid molecule according to any one 
of claims 14 to 17 which is selected from the group 
consisting of T21, T8, Race product, ATCd3-2-1 and ATCd3-3- 
10 as shown in Figure 1. 

19. A vector comprising a nucleic acid molecule 
according to any one of claims 14 to 18. 

20. A vector according to claim 19 wherein the 
vector is selected from the group consisting of λTripleEx, [...] 

21. A cell [...] claim 19 or 20. 

22. A cell according to claim 21, wherein the 
cell is BM25.8 E. coli or COS-7. 

23. A method for making a peptide according to 
any one of claims 1 to 13, the method comprising the step 
of maintaining a cell according to claim 20 in conditions 
in which the peptide is expressed by the cell. 

24. A method according to claim 23 comprising 
the further step of isolating the peptide. 

25. A peptide when produced by the method of 
claim 23 or 24. 

26. A composition comprising a peptide according 
to any one of claims 1 to 13 or 25 and a pharmaceutically 
acceptable carrier. 

27. An antibody which is capable of binding a 
peptide according to any one of claims 1 to 13 or 25. 

28. A hybridoma which ???? an antibody according 
to claim 27. 

29. A method of cleaving a peptide bond in a 
molecule, which is C-terminal to proline in the sequence 
Ala-Pro, the method comprising maintaining the molecule in 
the presence of a peptide according to any one of claims 1 
to 13 or 25 so that the peptide bond is cleaved. 

[Figure 1: DPP4L1 cDNA nucleotide sequence (numbered 1 to 3127) with the deduced amino acid sequence shown beneath; the sequence listing is not legible in the source.] 